241 research outputs found

    Dynamics of heuristic optimization algorithms on random graphs

    In this paper, the dynamics of heuristic algorithms for constructing small vertex covers (or independent sets) of finite-connectivity random graphs is analysed. In every algorithmic step, a vertex is chosen with respect to its vertex degree. This vertex, and some environment of it, is covered and removed from the graph. This graph reduction process can be described as a Markovian dynamics in the space of random graphs of arbitrary degree distribution. We discuss some solvable cases, including algorithms already analysed using different techniques, and develop approximation schemes for more complicated cases. The approximations are corroborated by numerical simulations. Comment: 19 pages, 3 figures, version to app. in EPJ
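
    The graph reduction step described in the abstract above can be illustrated with a minimal sketch. It assumes one particular selection rule (always cover a maximum-degree vertex and remove only that vertex), which is just one member of the family of degree-based heuristics the abstract mentions. The function names `random_graph`, `greedy_vertex_cover` and `is_vertex_cover` are hypothetical and do not come from the paper.

```python
import random


def random_graph(n, c, seed=0):
    """Random graph on n vertices; each pair is linked with probability c/n."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    p = c / n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def greedy_vertex_cover(adj):
    """Assumed rule: repeatedly cover a maximum-degree vertex and delete it."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    cover = set()
    while True:
        v = max(adj, key=lambda u: len(adj[u]), default=None)
        if v is None or not adj[v]:
            break  # no edges left, so every edge is covered
        cover.add(v)
        for u in adj[v]:
            adj[u].discard(v)  # remove the edges incident to v
        del adj[v]
    return cover


def is_vertex_cover(adj, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u in adj for v in adj[u])


if __name__ == "__main__":
    g = random_graph(500, 2.0, seed=1)
    cover = greedy_vertex_cover(g)
    print(len(cover) / len(g), is_vertex_cover(g, cover))
```

    Running the script prints the fraction of covered vertices for one sample graph. Averaging that fraction over many seeds would give a rough numerical counterpart to the kind of simulation the abstract refers to.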

    Selection of sequence motifs and generative Hopfield-Potts models for protein families

    Statistical models for families of evolutionary related proteins have recently gained interest: in particular pairwise Potts models, as those inferred by the Direct-Coupling Analysis, have been able to extract information about the three-dimensional structure of folded proteins, and about the effect of amino-acid substitutions in proteins. These models are typically requested to reproduce the one- and two-point statistics of the amino-acid usage in a protein family, i.e. to capture the so-called residue conservation and covariation statistics of proteins of common evolutionary origin. Pairwise Potts models are the maximum-entropy models achieving this. While being successful, these models depend on huge numbers of ad hoc introduced parameters, which have to be estimated from finite amount of data and whose biophysical interpretation remains unclear. Here we propose an approach to parameter reduction, which is based on selecting collective sequence motifs. It naturally leads to the formulation of statistical sequence models in terms of Hopfield-Potts models. These models can be accurately inferred using a mapping to restricted Boltzmann machines and persistent contrastive divergence. We show that, when applied to protein data, even 20-40 patterns are sufficient to obtain statistically close-to-generative models. The Hopfield patterns form interpretable sequence motifs and may be used to clusterize amino-acid sequences into functional sub-families. However, the distributed collective nature of these motifs intrinsically limits the ability of Hopfield-Potts models in predicting contact maps, showing the necessity of developing models going beyond the Hopfield-Potts models discussed here. Comment: 26 pages, 16 figures, to app. in PR

    Typical solution time for a vertex-covering algorithm on finite-connectivity random graphs

    In this letter, we analytically describe the typical solution time needed by a backtracking algorithm to solve the vertex-cover problem on finite-connectivity random graphs. We find two different transitions: The first one is algorithm-dependent and marks the dynamical transition from linear to exponential solution times. The second one gives the maximum computational complexity, and is found exactly at the threshold where the system undergoes an algorithm-independent phase transition in its solvability. Analytical results are corroborated by numerical simulations. Comment: 4 pages, 2 figures, to appear in Phys. Rev. Let
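
    To make the notion of "solution time" of a backtracking search concrete, here is a minimal sketch of a generic branching procedure for the vertex-cover decision problem. It is not the specific algorithm analysed in the letter: the branching rule (pick the first uncovered edge and try each endpoint) and the names `has_cover` and `stats` are assumptions made for illustration. The number of visited search-tree nodes serves as a simple proxy for running time.

```python
def has_cover(edges, k, stats=None):
    """Decide whether the edge list admits a vertex cover of size <= k.

    If a dict with a "nodes" entry is passed as stats, the number of
    visited search-tree nodes is accumulated there.
    """
    if stats is not None:
        stats["nodes"] += 1
    if not edges:
        return True  # nothing left to cover
    if k == 0:
        return False  # edges remain but no covering vertices are left
    u, v = edges[0]
    # Any cover must contain u or v, so branch on both choices.
    for w in (u, v):
        remaining = [e for e in edges if w not in e]
        if has_cover(remaining, k - 1, stats):
            return True
    return False  # both branches failed: backtrack


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    for k in (1, 2):
        stats = {"nodes": 0}
        print(k, has_cover(triangle, k, stats), stats["nodes"])
```

    Recording `stats["nodes"]` over random graphs of growing size, for a fixed allowed cover fraction, is one simple way to observe numerically whether the search effort grows linearly or exponentially.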

    Threshold values, stability analysis and high-q asymptotics for the coloring problem on random graphs

    We consider the problem of coloring Erdos-Renyi and regular random graphs of finite connectivity using q colors. It has been studied so far using the cavity approach within the so-called one-step replica symmetry breaking (1RSB) ansatz. We derive a general criterion for the validity of this ansatz and, applying it to the ground state, we provide evidence that the 1RSB solution gives exact threshold values c_q for the q-COL/UNCOL phase transition. We also study the asymptotic thresholds for q >> 1 finding c_q = 2q log(q) - log(q) - 1 + o(1) in perfect agreement with rigorous mathematical bounds, as well as the nature of excited states, and give a global phase diagram of the problem. Comment: 23 pages, 10 figures. Replaced with accepted versio
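
    The large-q expression quoted in the abstract above can be evaluated directly. The sketch below assumes that log denotes the natural logarithm and simply drops the o(1) correction, so the values are only the leading asymptotic terms and should not be read as exact thresholds for small q. The function name `asymptotic_coloring_threshold` is hypothetical.

```python
import math


def asymptotic_coloring_threshold(q):
    """Leading terms 2 q ln(q) - ln(q) - 1 of c_q; the o(1) term is dropped.

    Assumption: "log" in the quoted formula is the natural logarithm.
    """
    if q < 2:
        raise ValueError("need at least two colors")
    return 2 * q * math.log(q) - math.log(q) - 1


if __name__ == "__main__":
    for q in (3, 10, 100, 1000):
        print(q, round(asymptotic_coloring_threshold(q), 3))
```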

    A variational description of the ground state structure in random satisfiability problems

    A variational approach to finite connectivity spin-glass-like models is developed and applied to describe the structure of optimal solutions in random satisfiability problems. Our variational scheme accurately reproduces the known replica symmetric results and also allows for the inclusion of replica symmetry breaking effects. For the 3-SAT problem, we find two transitions as the ratio $\alpha$ of logical clauses per Boolean variables increases. At the first one $\alpha_s \simeq 3.96$, a non-trivial organization of the solution space in geometrically separated clusters emerges. The multiplicity of these clusters as well as the typical distances between different solutions are calculated. At the second threshold $\alpha_c \simeq 4.48$, satisfying assignments disappear and a finite fraction $B_0 \simeq 0.13$ of variables are overconstrained and take the same values in all optimal (though unsatisfying) assignments. These values have to be compared to $\alpha_c \simeq 4.27, B_0 \simeq 0.4$ obtained from numerical experiments on small instances. Within the present variational approach, the SAT-UNSAT transition naturally appears as a mixture of a first and a second order transition. For the mixed $2+p$-SAT with $p<2/5$, the behavior is as expected much simpler: a unique smooth transition from SAT to UNSAT takes place at $\alpha_c=1/(1-p)$. Comment: 24 pages, 6 eps figures, to be published in Europ. Phys. J.

    From principal component to direct coupling analysis of coevolution in proteins: Low-eigenvalue modes are needed for structure prediction

    Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant 'patterns' of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. Comment: Supporting information can be downloaded from: http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.100317

    Towards finite-dimensional gelation

    We consider the gelation of particles which are permanently connected by random crosslinks, drawn from an ensemble of finite-dimensional continuum percolation. To average over the randomness, we apply the replica trick, and interpret the replicated and crosslink-averaged model as an effective molecular fluid. A Mayer-cluster expansion for moments of the local static density fluctuations is set up. The simplest non-trivial contribution to this series leads back to mean-field theory. The central quantity of mean-field theory is the distribution of localization lengths, which we compute for all connectivities. The highly crosslinked gel is characterized by a one-to-one correspondence of connectivity and localization length. Taking into account higher contributions in the Mayer-cluster expansion, systematic corrections to mean-field can be included. The sol-gel transition shifts to a higher number of crosslinks per particle, as more compact structures are favored. The critical behavior of the model remains unchanged as long as finite truncations of the cluster expansion are considered. To complete the picture, we also discuss various geometrical properties of the crosslink network, e.g. connectivity correlations, and relate the studied crosslink ensemble to a wider class of ensembles, including the Deam-Edwards distribution. Comment: 18 pages, 4 figures, version to be published in EPJ